8 research outputs found

    Trident: Controlling Side Effects in Automated Program Repair

    Get PDF
    The goal of program repair is to eliminate a bug in a given program by automatically modifying its source code. The majority of real-world software is written in imperative programming languages. Each function or expression in imperative code may have side effects, observable effects beyond returning a value. Existing program repair approaches have a limited ability to handle side effects. Previous test-driven semantic repair approaches only synthesise patches without side effects. Heuristic repair approaches generate patches with side effects only if suitable code fragments exist in the program or a database of repair patterns, or can be derived from training data. This work introduces Trident, the first test-driven program repair approach that synthesizes patches with side effects without relying on the plastic surgery hypothesis, a database of patterns, or training data. Trident relies on an interplay of several parts. First, it infers a specification for synthesising side-effected patches using symbolic execution with a custom state merging strategy that alleviates path explosion due to side effects. Second, it uses a novel component-based patch synthesis approach that supports lvalues, values that appear on the left-hand sides of assignments. In an evaluation on open-source projects, Trident successfully repaired 6 out of 10 real bugs that require insertion of new code with side effects, which previous techniques do not therefore repair. Evaluated on the ManyBugs benchmark, Trident successfully repaired two new bugs that previous approaches could not. Adding patches with side effects to the search space can exacerbate test-overfitting. We experimentally demonstrate that the simple heuristic of preferring patches with the fewest side effects alleviates the problem. An evaluation on a large number of smaller programs shows that this strategy reduces test-overfitting caused by side-effects, increasing the rate of correct patches from 33.3% to 58.3%

    Use of General Repair Tool for Fixing Security Vulnerabilities

    Get PDF
    Automated patch generation approaches have been shown to address defects in real-world programs, including security vulnerabilities. On the one hand, general repair tools are designed to fix common bugs. On the other hand, specific repair tools targeted security-related vulnerabilities, such as integer or buffer overflow. However, fewer works focus on assessing general repair tools' capabilities to fix security vulnerabilities. The assessment will be helpful to find out if general repair tools can fix security vulnerabilities without knowing specific characterizing patterns of such vulnerabilities in advance, thus not being subjected to overfitting. In this paper, we present a detailed analysis of a case study using the semantic general repair tool, F1X, to fix security-related vulnerabilities found by the OSS-Fuzz framework. OSS-Fuzz framework is an automated continuous testing platform run on Google's cloud infrastructure, which can pinpoint source code containing security-related vulnerability and generate its failing test case. Using a dataset of 240 security vulnerabilities found in five open source programs from OSS-Fuzz, we compared and analyzed fix patterns generated by F1X with fixing patterns in OSS-Fuzz Github repositories. We believe that the result of this case study will be insightful for developers to strengthen and optimize their repair tools and security analysts to consider integrating automated repair tools into production systems

    Evaluating Automatic Program Repair Capabilities to Repair API Misuses

    Get PDF
    API misuses are well-known causes of software crashes and security vulnerabilities. However, their detection and repair is challenging given that the correct usages of (third-party) APIs might be obscure to the developers of client programs. This paper presents the first empirical study to assess the ability of existing automated bug repair tools to repair API misuses, which is a class of bugs previously unexplored. Our study examines and compares 14 Java test-suite-based repair tools (11 proposed before 2018, and three afterwards) on a manually curated benchmark (APIREPBENCH) consisting of 101 API misuses. We develop an extensible execution framework (APIARTY) to automatically execute multiple repair tools. Our results show that the repair tools are able to generate patches for 28% of the API misuses considered. While the 11 less recent tools are generally fast (the median execution time of the repair attempts is 3.87 minutes and the mean execution time is 30.79 minutes), the three most recent are less efficient (i.e., 98% slower) than their predecessors. The tools generate patches for API misuses that mostly belong to the categories of missing null check, missing value, missing exception, and missing call. Most of the patches generated by all tools are plausible (65%), but only few of these patches are semantically correct to human patches (25%). Our findings suggest that the design of future repair tools should support the localisation of complex bugs, including different categories of API misuses, handling of timeout issues, and ability to configure large software projects. Both APIREPBENCH and APIARTY have been made publicly available for other researchers to evaluate the capabilities of repair tools on detecting and fixing API misuses

    Test-Equivalence Analysis for Automatic Patch Generation

    Get PDF
    Automated program repair is a problem of finding a transformation (called a patch) of a given incorrect program that eliminates the observable failures. It has important applications such as providing debugging aids, automatically grading student assignments, and patching security vulnerabilities. A common challenge faced by existing repair techniques is scalability to large patch spaces, since there are many candidate patches that these techniques explicitly or implicitly consider. The correctness criteria for program repair is often given as a suite of tests. Current repair techniques do not scale due to the large number of test executions performed by the underlying search algorithms. In this work, we address this problem by introducing a methodology of patch generation based on a test-equivalence relation (if two programs are “test-equivalent” for a given test, they produce indistinguishable results on this test). We propose two test-equivalence relations based on runtime values and dependencies, respectively, and present an algorithm that performs on-the-fly partitioning of patches into test-equivalence classes. Our experiments on real-world programs reveal that the proposed methodology drastically reduces the number of test executions and therefore provides an order of magnitude efficiency improvement over existing repair techniques, without sacrificing patch quality

    Re-factoring based program repair applied to programming assignments

    Get PDF
    Automated program repair has been used to provide feedback for incorrect student programming assignments, since program repair captures the code modification needed to make a given buggy program pass a given test-suite. Existing student feedback generation techniques are limited because they either require manual effort in the form of providing an error model, or require a large number of correct student submissions to learn from, or suffer from lack of scalability and accuracy. In this work, we propose a fully automated approach for generating student program repairs in real-time. This is achieved by first re-factoring all available correct solutions to semantically equivalent solutions. Given an incorrect program, we match the program with the closest matching refactored program based on its control flow structure. Subsequently, we infer the input-output specifications of the incorrect program's basic blocks from the executions of the correct program's aligned basic blocks. Finally, these specifications are used to modify the blocks of the incorrect program via search-based synthesis. Our dataset consists of almost 1,800 real-life incorrect Python program submissions from 361 students for an introductory programming course at a large public university. Our experimental results suggest that our method is more effective and efficient than recently proposed feedback generation approaches. About 30% of the patches produced by our tool Refactory are smaller than those produced by the state-of-art tool Clara, and can be produced given fewer correct solutions (often a single correct solution) and in a shorter time. We opine that our method is applicable not only to programming assignments, and could be seen as a general-purpose program repair method that can achieve good results with just a single correct reference solution

    On the Time Performance of Automated Fixes

    No full text
    Automated program repair has made major strides showing its exciting potential, but all efforts to turn the techniques into practical tools usable by software developers hit a crucial blocking factor: the timing issue. Today???s techniques are slow. Too slow by an order of magnitude at least. The long response time implies that the currently available techniques cannot suit the actual needs of developers in the field. What developers want is a tool that can instantaneously propose a fix for a detected failure. A technique that can propose a patch instantaneously (or near-instantaneously) would provide the breakthrough that is required to turn automated program repair from an attractive research topic into a practical software engineering tool. Indeed, researchers have started to tackle this speed issue. In this paper, we survey recent approaches that were shown to be effective in speeding up automated program repair. In particular, we view automated program repair as a search problem???the ultimate goal of automated program repair is the search for a patch which is often preceded by other related searches such as a search for suspicious program locations, and a search for the specification for a patch. We describe how the problem of automated program repair has been decomposed into a series of search problems, and explain how these individual search problems have been solved. We expect that our paper would provide insight into how to speed up automated program repair by further optimizing the search for a patch

    Semi-supervised Verified Feedback Generation

    No full text
    Students have enthusiastically taken to online programming lessons and contests. Unfortunately, they tend to struggle due to lack of personalized feedback. There is an urgent need of program analysis and repair techniques capable of handling both the scale and variations in student submissions, while ensuring quality of feedback. Towards this goal, we present a novel methodology called semi-supervised verified feedback generation. We cluster submissions by solution strategy and ask the instructor to identify or add a correct submission in each cluster. We then verify every submission in a cluster against the instructor-validated submission in the same cluster. If faults are detected in the submission then feedback suggesting fixes to them is generated. Clustering reduces the burden on the instructor and also the variations that have to be handled during feedback generation. The verified feedback generation ensures that only correct feedback is generated. We implemented a tool, named CoderAssist, based on this approach and evaluated it on dynamic programming assignments. We have designed a novel counter-example guided feedback generation algorithm capable of suggesting fixes to all faults in a submission. In an evaluation on 2226 submissions to 4 problems, CoderAssist could generate verified feedback for 1911 (85%) submissions in 1.6s each on an average. It does a good job of reducing the burden on the instructor. Only one submission had to be manually validated or added for every 16 submissions
    corecore